We study the necessity of predictive information in a class of queueing admission control problems, where 
a system manager is allowed to divert incoming jobs up to a fixed rate, in order to minimize the queueing 
delay experienced by the admitted jobs. 

Spencer et al. (2014) show that the system’s delay performance can be significantly improved by having 
access to future information in the form of a lookahead window, during which the times of future arrivals and 
services are revealed. They prove that, while delay under an optimal online policy diverges to infinity in the 
heavy-traffic regime, it can stay bounded by making use of future information. However, the diversion policies 
of Spencer et al. (2014) require the length of the lookahead window to grow to infinity at a non-trivial rate 
in the heavy-traffic regime, and it remained open whether substantial performance improvement could still 
be achieved with less future information. 

We resolve this question to a large extent by establishing an asymptotically tight lower bound on how 
much future information is necessary to achieve superior performance, which matches the upper bound of 
Spencer et al. (2014) up to a constant multiplicative factor. Our result hence demonstrates that the system’s 
heavy-traffic delay performance is highly sensitive to the amount of future information available. Our proof 
is based on analyzing certain excursion probabilities of the input sample paths, and exploiting a connection 
between a policy’s diversion decisions and subsequent server idling, which may be of independent interest 
for related dynamic resource allocation problems. 

Key words: admission control, queueing, algorithm, future information, predictive model, heavy-traffic 
asymptotics 


1. Introduction 


Recently, there has been substantial interest in developing forecasting systems and predictive 
models across various application domains, which enable a system manager to obtain (partial) 
information about future inputs, and thus allow for more efficient decision making or resource allocation. 
Examples of these systems include advanced ordering in supply chains (Fisher and Raman 
(1996)), appointment booking for elective surgeries (Kim and Horowitz (2002)), and mechanisms 
for predicting future hospital visits (Wargon et al. (2009), Sun et al. (2009)). Because acquiring 
accurate predictions can often involve additional infrastructural investments and operational complexities, 
it is a natural question to ask how useful such predictive information can be, in terms of 
its ability to improve system performance beyond what can be achieved by the more conventional 
way of online decision making, which does not take predictive information into account. 


In a recent paper, Spencer et al. (2014) initiated an investigation along this direction in a class 
of queueing admission control problems, illustrated in Figure 1. An overloaded queue with service 
rate $1-p$ receives incoming jobs at rate $\lambda \in (1-p, 1)$, and the system manager is allowed to divert 
incoming jobs up to a rate of $p$, with the objective of minimizing the time-average queueing delay 
among the admitted jobs. The system manager has access to a lookahead window of length $W_\lambda$, 
within which the realizations of future arrivals and service availability are revealed. The online 
version of the problem, with $W_\lambda = 0$, is a classic queueing model that has been studied in various 
contexts related to congestion control (Yechiali (1971), Stidham (1985)). 


Figure 1 An illustration of the queueing admission control problem. (Diagram labels: admitted jobs enter the queue, which is served at rate $1-p$; diverted jobs leave the system.) 


A main message of Spencer et al. (2014) is that one can drastically reduce queueing delay with 
a sufficient amount of future information. In particular, there exists $c_h > 0$, such that if the length 
of the lookahead window satisfies 

$$W_\lambda > c_h \ln \frac{1}{1-\lambda}, \tag{1}$$

then there exists a sequence of diversion policies, so that the resultant delay will stay bounded in 
the heavy-traffic regime of $\lambda \to 1$. In sharp contrast, when no future information is available, the 
delay under an optimal online policy will diverge to infinity, as $\lambda \to 1$. 
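
As a rough sense of scale for the requirement in Eq. (1), and ignoring the unspecified constant $c_h$, the logarithmic factor can be evaluated at a few illustrative arrival rates (the particular values of $\lambda$ below are chosen only for illustration):

```latex
\ln\frac{1}{1-\lambda}\Big|_{\lambda=0.9} = \ln 10 \approx 2.30, \qquad
\ln\frac{1}{1-\lambda}\Big|_{\lambda=0.99} = \ln 100 \approx 4.61, \qquad
\ln\frac{1}{1-\lambda}\Big|_{\lambda=0.999} = \ln 1000 \approx 6.91.
```

Each tenfold reduction of $1-\lambda$ adds $\ln 10 \approx 2.30$ to this factor, so the required window length diverges in heavy traffic, but only logarithmically. 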


However, the requirement on the length of the lookahead window, as in Eq. (1), means that the 
superior delay performance achieved by Spencer et al. (2014) comes at the expense of a non-trivial 
amount of predictive power. Therefore, it remains to determine whether one could use much less 
future information and still achieve a significant performance improvement over an optimal online 
policy. This question is of practical importance, because a larger amount of future information 
often requires more sophisticated predictive models and computational infrastructures, which can 
be costly, if not impossible, to build and operate. 

The main contribution of the present paper is to provide a negative answer to the above question, 
by showing that there exists a positive constant, $c_l$, such that if $W_\lambda$ scales slower than $c_l \ln \frac{1}{1-\lambda}$ as 
$\lambda \to 1$, then the resulting delay performance can be no better than that of an optimal online policy 
by more than a constant factor. As a by-product of our result, an interesting “conservation law” is 
established, which suggests that delay and future information are, in some sense, “exchangeable” 
quantities (see discussions in Section 2.1). 


Despite having identical modeling assumptions, our proof techniques are quite different from 
those employed by Spencer et al. (2014). The core of our arguments hinges upon a relationship 
between diversions and future idling of the server, evaluated over a certain subset of input sample 
paths. This relationship is then used in conjunction with the excursion probabilities of a transient 
random walk to demonstrate that the system manager must maintain a relatively large queue 
length, when the amount of future information is limited. We believe that this line of arguments 
is fairly robust to changes in modeling assumptions, and can be generalized, in other dynamic 
resource allocation problems, to proving lower bounds for the amount of information necessary in 
achieving desirable performance. 

1.1. Organization 

The remainder of the paper is organized as follows. In Section 2 we state our main result, Theorem 
1, and contrast it with the prior results of Spencer et al. (2014). In the same section, we discuss 
several implications of the theorem (Section 2.1), as well as connections of our work to the literature 
(Section 2.2). Section 3 describes the modeling assumptions in more detail, and introduces the 
necessary mathematical formalism. The proof of Theorem 1 is given in Section 4, with an outline 
of the proof ideas provided at the beginning of the section. We conclude the paper in Section 5 
and examine potential directions for future research. 

2. Main Result 


Review of Prior Results. We begin by informally reviewing the system model in Spencer et al. 
(2014), which will be described in detail in Section 3. The admission control problem runs in 
continuous time, and is characterized by three parameters: $\lambda$, $p$, and $W_\lambda$. An illustration of the 
system model is given in Figure 1. 

1. Jobs arrive to the system at the rate of $\lambda$, where $\lambda \in (0,1)$. There is a single server which 
processes jobs at the rate of $1-p$, where $p$ is a fixed constant in $(0,1)$. It is assumed that the 
system is operating in the overload regime, with $\lambda > 1-p$. 

2. Upon each job’s arrival, the system manager decides whether the job is to be admitted or 
diverted. If admitted, the job queues up in an (infinite) buffer until it is processed by the 
server, and if diverted, it leaves the system immediately. The goal of the system manager is 
to choose a diversion policy that minimizes the time-average queue length induced by the 
admitted jobs, subject to the constraint that the infinite-horizon time-average rate of diversion 
does not exceed $p$. 

We will be primarily interested in the heavy-traffic regime of $\lambda \to 1$, where the post-diversion 
arrival rate approaches the server capacity of $1-p$, assuming that the system manager diverts 
at the maximum allowable rate of $p$. Note that by Little’s Law, the time-average queue length 
is equal to the time-average queueing delay multiplied by the post-diversion arrival rate of 
$\lambda - p$. In the limit of $\lambda \to 1$, the two quantities will differ only by a multiplicative constant 
of $1-p$. Therefore, from this point on, we will focus on the time-average queue length as the 
performance metric, with the understanding that an analogous statement will hold for delay 
as well. 

3. The system manager has access to information about the future, which takes the form of 
a lookahead window of length $W_\lambda$: at time $t$, the times of arrivals and service availability 
within the interval $[t, t + W_\lambda]$ are revealed to the system manager.¹ The case of $W_\lambda = 0$ will be 
referred to as the online problem, since the system manager does not have access to any future 
information. 

Denote by $Q(\pi, \lambda, W_\lambda)$ the time-average queue length under the diversion policy $\pi$, given arrival 
rate $\lambda$ and a lookahead window of length $W_\lambda$. Let $Q^*(\lambda, W_\lambda)$ be the time-average queue length 
under an optimal diversion policy (assuming such optimal policies exist), with 

$$Q^*(\lambda, W_\lambda) = \min_{\pi} Q(\pi, \lambda, W_\lambda). \tag{2}$$


It is shown in Spencer et al. (2014) that a finite amount of lookahead into the future is sufficient to 
yield significant delay improvement over an online policy. In particular, fixing $p \in (0,1)$, they show 
that the optimal average queue length for an online policy diverges to infinity in the heavy-traffic 
regime, with 

$$Q^*(\lambda, 0) \sim \log_{\frac{1}{1-p}} \frac{1}{1-\lambda}, \quad \text{as } \lambda \to 1. \tag{3}$$

In sharp contrast, there exists a positive constant $c_h$, whose value can depend on $p$, so that if 

$$W_\lambda > c_h \ln \frac{1}{1-\lambda}, \tag{4}$$

for all $\lambda$ sufficiently close to 1, then the optimal average queue length converges to a finite constant 
in the heavy-traffic regime: 

$$Q^*(\lambda, W_\lambda) \to \frac{1-p}{p}, \quad \text{as } \lambda \to 1. \tag{5}$$


¹ Depending on the application, one can think of the lookahead window as being provided by some external oracle, 
or a predictive model that has access to side information. 

A main open question posed by Spencer et al. (2014) is whether significant performance gain 
over the online policy can still be achieved under much less future information. It is conjectured 
that if $W_\lambda = o\left(\ln \frac{1}{1-\lambda}\right)$, then the average queue length will necessarily diverge to infinity in the 
heavy-traffic limit (Conjecture 1, Spencer et al. (2014)). In other words, a sufficient amount of 
future information may be essential in achieving superior delay performance. 

Our Result. The main result of this paper confirms, and strengthens, this conjecture of 
Spencer et al. (2014). We show that if the amount of future information is insufficient even by a 
constant factor, then not only will the delay be infinite in the heavy-traffic regime, but the delay 
scaling will essentially be no better than that of an online policy. Specifically, we have the following 
theorem. 

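Theorem 1 (Necessity of Future Information). Fix $p \in (0,1)$. There exist $c_l > 0$ and $\bar{\lambda} \in (1-p, 1)$, 
so that if 

$$W_\lambda < c_l \ln \frac{1}{1-\lambda}, \quad \forall \lambda \in (\bar{\lambda}, 1), \tag{6}$$

then 

$$Q^*(\lambda, W_\lambda) = \Theta\left(\ln \frac{1}{1-\lambda}\right), \quad \text{as } \lambda \to 1. \tag{7}$$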


Together with the results of Spencer et al. (2014), Theorem 1 suggests that the performance of 
the admission control problem depends critically on the amount of future information available, and 
in particular, on how the length of the lookahead window, $W_\lambda$, scales relative to the watershed of 
$\Theta\left(\ln \frac{1}{1-\lambda}\right)$. A graphical illustration of Theorem 1, with a comparison to the results of Spencer et al. 
(2014), is provided in Figure 2. 


The proof of Theorem 1 is given in Section 4. It is worth noting that our proof techniques are 
quite different from those employed by Spencer et al. (2014). In fact, they are somewhat “dual” to 
each other: the earlier achievability result (Eq. (5)) was proved by analyzing the distribution of the 
lengths of busy periods associated with the queue length process (a property in time), whereas the 
core of our arguments relies on the excursion properties of a transient random walk (a property in 
space). 


2.1. Implications of Theorem 1 

There are several interesting implications of Theorem 1. First, by virtue of being a lower bound 
for the case where the decision maker is given the exact realizations of future input, Theorem 1 
automatically extends to settings where predictions can be noisy or corrupted, as is typically the 
case in practical applications. 

² The notation $f(x) = \Theta(g(x))$, as $x \to 1$, represents the statement that, for any sequence $x_n \to 1$, we have 
$0 < \liminf_{n \to \infty} f(x_n)/g(x_n) \leq \limsup_{n \to \infty} f(x_n)/g(x_n) < \infty$. 

Figure 2 Impact of future information on the effectiveness of admission control, in the heavy-traffic regime of 
$\lambda \to 1$. The solid red segment corresponds to the regime established by this paper, where $W_\lambda \lesssim c_l \ln \frac{1}{1-\lambda}$ 
(Theorem 1), and the dotted black segment corresponds to the regime established by Spencer et al. 
(2014), where $W_\lambda \gtrsim c_h \ln \frac{1}{1-\lambda}$ (Eqs. (4) and (5) in the current paper). The case of $W_\lambda = 0$ is covered by 
either paper. 


Theorem 1 also implies an interesting “conservation law” between delay and future information: 
from Eqs. (3) through (7), we see that the sum of $Q^*(\lambda, W_\lambda)$ and $W_\lambda$ must be of order $\Omega\left(\ln \frac{1}{1-\lambda}\right)$, 
as $\lambda \to 1$. In a rough sense, this is because the same type of stochastic discrepancies in the input 
processes, which necessitate large queueing delays in the heavy-traffic regime when future information is 
limited, also determine how much lookahead is required in order to achieve a bounded delay. Even 
though such conservation seems to suggest that there is no “free lunch” to be had, the ability to 
understand and make such trade-offs can still be useful, because depending on the application, 
future information may be significantly less costly than delay, or vice versa. 

From an operational point of view, although Theorem 1 invalidates the usefulness of future 
information in certain regimes, it is nevertheless reassuring to know that a simple online policy 
could do almost as well as any sophisticated prediction-guided policies, even when the amount of 
predictive information available grows as the traffic intensity increases. Moreover, the theorem does 
not rule out the possibility of having meaningful prediction-guided policies when future information 
is limited; it only implies that our search in such scenarios should aim at more moderate, constant-factor 
performance improvements over online policies. In fact, numerical results in Xu and Chan 
(2014) on a similar admission control model suggest that sizable performance gains can still be 
achievable, even with limited and noisy predictive information. 

2.2. Related Work 


In terms of modeling assumptions, our setup is identical to that of Spencer et al. (2014), and 
hence we refer the reader to Spencer et al. (2014) for a review of the model’s connections with the 
literature on classical Markov admission control problems and competitive analysis. The model is 
also related to a multi-server system with partial resource pooling (cf. Tsitsiklis and Xu (201…)); 
the reader is referred to Chapter 7 of Xu (2014) for more details. In addition, Xu and Chan (2014) 
examines the model’s relevance in the context of reducing waiting times at emergency departments. 

Our result can be viewed as a generalization of the Markov optimal admission control problem 
that has been studied in the literature (Stidham (1985)), and it is interesting to contrast some of the 
differences in analytical approaches. Optimal policies in the Markov setting ($W_\lambda = 0$) are known to 
often admit a threshold (or control-limit) form, where a diversion is made only if the current queue 
length reaches a fixed threshold. To prove the optimality of these policies, one would typically 
analyze the Bellman equations of the corresponding Markov decision process (MDP) in order to 
establish a set of monotonicity properties in the policy space, e.g., that the cost-to-go function for 
a threshold policy would be dominated by policies that divert with non-zero probabilities when the 
queue is small (cf. Yechiali (1971)). Successive applications of such monotonicity properties will 
then narrow the policy space down to only those with a threshold form. 
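
The threshold structure makes the online problem easy to explore numerically. The sketch below is an illustration under stated assumptions, not the construction used in the cited works: it assumes a pure threshold policy with an integer threshold $K$, under which the queue is a birth-death chain on $\{0, \ldots, K\}$ with up-rate $\lambda$ and down-rate $1-p$, and it searches for the smallest threshold whose stationary diversion rate is at most $p$. The function names `diversion_rate` and `smallest_feasible_threshold` are hypothetical.

```python
def diversion_rate(lam, p, threshold):
    """Stationary diversion rate of a threshold policy: lam * P(queue = threshold).

    Assumes the overload regime lam > 1 - p, so that rho > 1. Under the policy the
    queue is a birth-death chain on {0, ..., threshold} with up-rate lam and
    down-rate 1 - p, hence its stationary probabilities are proportional to rho**k.
    """
    rho = lam / (1 - p)
    prob_full = rho**threshold * (rho - 1) / (rho**(threshold + 1) - 1)
    return lam * prob_full


def smallest_feasible_threshold(lam, p):
    """Smallest integer threshold whose diversion rate does not exceed p.

    The loop terminates for lam < 1, because the diversion rate decreases
    towards lam - (1 - p) < p as the threshold grows.
    """
    threshold = 0
    while diversion_rate(lam, p, threshold) > p:
        threshold += 1
    return threshold


if __name__ == "__main__":
    for lam in (0.9, 0.99, 0.999):
        print(lam, smallest_feasible_threshold(lam, p=0.5))
```

For a fixed $p$, the smallest feasible threshold increases as $\lambda$ approaches 1, which is in line with the divergence of the online queue length reviewed in Section 2. 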

Unfortunately, these arguments employed in the Markov setting do not seem to carry over 
easily when the lookahead window is taken into account. While our setting can still be cast as 
an MDP by incorporating the lookahead window into the state space, the structure of the state 
space is now considerably more complex (and increasingly so, as $W_\lambda \to \infty$), and it is not so clear 
as to whether any monotonicity property continues to hold. Our proof techniques circumvent this 
complexity by focusing on the “macroscopic” sample-path characteristics of the system, instead 
of the more refined details of the Bellman equations. As a trade-off, our analysis is more “coarse” 
by nature, and it provides neither a characterization of the multiplicative constant in the delay 
scaling, nor a concrete diversion policy that achieves the lower bound of the necessary amount of 
future information (which, fortunately, has already been given in Spencer et al. (2014)). 

Our work is also similar in spirit to the techniques of information relaxation and path-wise 
optimization for MDPs (Rogers (2007), Brown et al. (2010), Desai et al. (2012)). In this case, one 
considers a relaxed version of the original MDP, where the decision maker has access to realizations 
of the future input sample paths. This relaxed problem is often simpler to solve and simulate than 
the original stochastic optimization problem, and hence can be used, for instance, as a performance 
benchmark for evaluating heuristic policies. Our work is different from this literature in several 
aspects. Most notably, we focus on rigorously understanding the stochastic dynamics involved in 
the relaxed problem with future information, and how performance scales with respect to the length 
of the lookahead window, as opposed to using the relaxed problem to approximate the performance 
of an optimal online policy, which is well understood in our setting. 
3. Model and Notation 

We now present the mathematical formalism and modeling assumptions that will be used throughout 
the remainder of the paper. An illustration of the system is given in Figure 1. 

System Dynamics. The system runs in continuous time, indexed by $t \in \mathbb{R}_+$. There is a queue with 
infinite waiting room, whose length at time $t$ is denoted by $Q(t)$. The input to the system consists 
of two independent Poisson processes: 

1. $\mathcal{A}$, with rate $\lambda$, which corresponds to the arrival of jobs; 

2. $\mathcal{S}$, with rate $1-p$, which corresponds to the generation of service tokens. 

When an event occurs in $\mathcal{A}$ at time $t$, we say that a job has arrived to the system, and the value 
of $Q(t)$ is incremented by 1, if the job is “admitted” (see below for the description of admission 
policies). Similarly, when an event occurs in the process $\mathcal{S}$ at time $t$, we say that a service token is 
generated, and the value of $Q(t)$ is decremented by 1, if $Q(t) > 0$, and remains at 0, otherwise.³ 
For our purposes, it is more convenient to work with the sequence $\{(Z_n, R_n) : n \in \mathbb{N}\}$, where 

$$Z_n = \text{time of the $n$th event in } \mathcal{A} \cup \mathcal{S}, \tag{8}$$


and $R_n$ encodes the type of the $n$th event, with 

$$R_n = \begin{cases} 1, & \text{if the $n$th event is in } \mathcal{A} \text{ (arrival)}, \\ -1, & \text{if the $n$th event is in } \mathcal{S} \text{ (service token)}. \end{cases} \tag{9}$$

We will let $\{M(t) : t \in \mathbb{R}_+\}$ be the counting process associated with $\{Z_n\}$, with 

$$M(t) = \sup\{n \in \mathbb{Z}_+ : Z_n \leq t\}, \tag{10}$$

and denote by $S(s,t)$ the difference between the numbers of arrivals and service tokens in the 
interval $(s,t]$, 

$$S(s,t) = \sum_{n = M(s)+1}^{M(t)} R_n. \tag{11}$$

Note that when $\lambda \neq 1-p$ the process $\{S(0,t) : t \in \mathbb{R}_+\}$ is a transient random walk, with 

$$\mathbb{E}(S(0,t)) = [\lambda - (1-p)]\,t. \tag{12}$$
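
The event sequence and the random walk above can be sketched in a few lines of code. This is a minimal illustration rather than part of the analysis: it assumes the standard fact that the superposition of the two Poisson processes is a Poisson process of rate $\lambda + 1 - p$ whose events are independently arrivals with probability $\lambda / (\lambda + 1 - p)$, and the helper names `generate_events`, `M` and `S` are chosen here only to mirror the notation.

```python
import random


def generate_events(lam, p, horizon, seed=0):
    """Return the merged, time-ordered event list [(Z_n, R_n)] on (0, horizon].

    Each event is an arrival (R_n = +1) with probability lam / (lam + 1 - p),
    and a service token (R_n = -1) otherwise.
    """
    rng = random.Random(seed)
    total_rate = lam + 1 - p
    events, t = [], 0.0
    while True:
        t += rng.expovariate(total_rate)  # exponential gap between merged events
        if t > horizon:
            return events
        r = 1 if rng.random() < lam / total_rate else -1
        events.append((t, r))


def M(events, t):
    """Counting process: the number of events with Z_n <= t."""
    return sum(1 for z, _ in events if z <= t)


def S(events, s, t):
    """Arrivals minus service tokens in the interval (s, t]."""
    return sum(r for _, r in events[M(events, s):M(events, t)])


if __name__ == "__main__":
    events = generate_events(lam=0.95, p=0.2, horizon=10_000.0)
    # The empirical drift fluctuates around lam - (1 - p) = 0.15.
    print(S(events, 0.0, 10_000.0) / 10_000.0)
```

Because the list is time-ordered, slicing between `M(events, s)` and `M(events, t)` selects exactly the events that fall in $(s, t]$. 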


Future Information. The notion of future information is captured by a lookahead window. At 
any time $t$, the system manager has access to the realization of all events in $\mathcal{A} \cup \mathcal{S}$ in the interval 
$[t, t + W_\lambda]$. Throughout, we will denote by $W_\lambda$ the length of the lookahead window, under arrival 
rate $\lambda$. 

³ The generation of a service token at time $t$ can be thought of as the server being able to fetch a new job from 
the queue at time $t$. As such, the service token model attributes the randomness in processing times to an external 
source, which does not depend on the identities of the jobs. It can be shown that, in the online setting, the service 
token model is equivalent to the more conventional assumption of exponentially distributed job sizes, though such 
equivalence is generally not true when future information is taken into account. The reader is referred to Page 9 of 
Spencer et al. (2014), and the references therein, for more details on the service token model. 

Admission Policies. Upon arrival, each job is either admitted, in which case it joins the queue, or 
diverted, in which case it disappears from the system immediately. The role of a diversion policy, $\pi$, 
is to output a sequence of diversion decisions for all events, represented by the sequence of indicator 
variables, $\{H(n) : n \in \mathbb{N}\}$, where 

$$H(n) = \mathbb{I}\{R_n = 1, \text{ and } \pi \text{ chooses to divert at time } Z_n\}. \tag{13}$$


Given the form of future information, we will require that the diversion policy be $(t + W_\lambda)$-causal, 
so that the decision made at time $t$ does not depend on any event after time $t + W_\lambda$. A diversion 
policy is said to be feasible, if the resulting time-average rate of diversion is at most $p$, i.e., 

$$\limsup_{N \to \infty} \frac{\lambda + 1 - p}{N}\, \mathbb{E}\left( \sum_{n=1}^{N} H(n) \right) \leq p, \tag{14}$$

where the constant $\lambda + 1 - p$ corresponds to the total rate of events in $\mathcal{A} \cup \mathcal{S}$. The objective of the 
decision maker is to choose a feasible policy, $\pi$, so as to minimize the time-average queue length, 
defined by 

$$Q(\pi, \lambda, W_\lambda) = \limsup_{N \to \infty} \mathbb{E}\left( \frac{1}{N} \sum_{n=1}^{N} Q(Z_n) \right). \tag{15}$$
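
To make the feasibility constraint and the performance metric concrete, the following sketch simulates the service-token model under a simple online threshold policy and returns empirical counterparts of Eqs. (14) and (15). It is only an illustration under assumptions: the threshold rule stands in for a generic online policy, the queue is sampled just before each event, and the function name `simulate_online_threshold` is hypothetical.

```python
import random


def simulate_online_threshold(lam, p, threshold, num_events, seed=0):
    """Simulate the model under an online threshold policy (no lookahead).

    An arrival is diverted if and only if the queue already holds `threshold` jobs.
    Returns a pair (average queue length over event epochs,
                    estimated diversion rate per unit time).
    """
    rng = random.Random(seed)
    total_rate = lam + 1 - p  # rate of the merged event process
    queue, queue_sum, diversions = 0, 0, 0
    for _ in range(num_events):
        queue_sum += queue  # queue length just before the event
        if rng.random() < lam / total_rate:  # the event is an arrival
            if queue >= threshold:
                diversions += 1  # H(n) = 1
            else:
                queue += 1
        elif queue > 0:  # the event is a service token, used only if a job waits
            queue -= 1
    avg_queue = queue_sum / num_events
    # Fraction of events that are diversions, scaled by the total event rate.
    diversion_rate = total_rate * diversions / num_events
    return avg_queue, diversion_rate


if __name__ == "__main__":
    print(simulate_online_threshold(lam=0.95, p=0.2, threshold=10, num_events=200_000))
```

A policy would be deemed feasible in this sketch when the estimated diversion rate stays at or below $p$; raising the threshold lowers the diversion rate at the cost of a longer queue. 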


3.1. Notation 


We will assume that all asymptotic expressions with respect to $\lambda$ are taken in the limit of $\lambda \to 1$. 
We will use $f \ll g$ and $f \preceq g$ to denote $f = o(g)$ and $f = O(g)$, respectively. We will write $f \lesssim g$ 
to mean that $f(x) \leq g(x)$ for all $x$ sufficiently close to 1, i.e., that there exists $y \in (0,1)$, such 
that $f(x) \leq g(x)$, for all $x \in (y, 1)$. The expressions $f \gg g$, $f \succeq g$, and $f \gtrsim g$ are defined analogously to 
their respective counterparts. When a statement is made concerning the limit “as $x \to 1$”, without 
specifying the exact sequence with respect to which the limit is taken, it is understood that the 
statement should hold for any sequence, $\{x_n\}$, with $\lim_{n \to \infty} x_n = 1$. The notation $X \stackrel{d}{=} Y$ means 
that the random variables $X$ and $Y$ have the same distribution. 


4. Proof of Theorem 1 

The remainder of the paper is devoted to the proof of Theorem 1. We begin with a high-level 
summary of the main steps involved. First, we argue that there exists a stationary optimal policy, 
which makes decisions only based on the current queue length and the content of the lookahead 
window. Furthermore, the queue length process under this stationary policy admits a well-defined 
steady-state distribution (Section 4.1.1). This stationarity will allow us to simplify the analysis by 
focusing on the policy’s actions over a finite time horizon. 

⁴ Throughout, $f(x-)$ represents the limit $\lim_{y \uparrow x} f(y)$. 

We will prove Theorem 1 by contradiction, where we start by assuming that a small average 
queue length is indeed achievable under an optimal stationary policy, even with a small lookahead 
window, and later refute this assumption. Our main arguments are based on the identification of 
a set of base sample paths (Section 4.2), with the property that any feasible policy must perform 
poorly over these sample paths, should the length of the lookahead window be too small. The 
stationarity property described earlier will then allow us to extend this argument to showing the 
policy’s failure over the infinite time horizon. It is worth noting that the base sample paths are 
not “typical,” in the sense that their occurrences possess only vanishingly small probability, as 
$\lambda \to 1$. This is because the failures of a policy under a small lookahead window are not caused by 
the average behavior of the inputs, but rather by some rare excursions of the random walk $S(0, \cdot)$. 
Though occurring with small probabilities, these excursions are in some sense unforeseeable under 
a small lookahead window, and their existence forces an optimal policy to be overly restrained in 
diverting jobs and hence yield a large average queue length. 

To carry out the arguments using the base sample paths, we will exploit a key relationship 
between diversions and server idling. In particular, we will demonstrate that, without sufficient 
lookahead, if a constant fraction of the arrivals are diverted during a specific portion of a base 
sample path, it will inevitably result in excessive idling of the server not far away in the future, 
even as $\lambda \to 1$. However, such server idling cannot occur in the heavy-traffic limit, since the server 
must be fully utilized in order to ensure system stability. This reasoning then implies that any 
policy that makes such diversions must be infeasible, or conversely, that any feasible policy must 
divert very few arrivals over these segments of the base sample paths (Proposition 1). However, 
such conservatism comes at a cost, in that it leads to long episodes during which the queue length 
stays at a high level (Proposition 2). We then argue that the frequent appearances of such “bad” 
episodes will result in a large average queue length in steady-state, which contradicts our 
initial assumption and hence completes the proof of Theorem 1. 

4.1. Preliminaries 

Without loss of generality, we will consider only the cases where the length of the lookahead 
window, $W_\lambda$, diverges to infinity in the heavy-traffic regime, i.e., 

$$W_\lambda \to \infty, \quad \text{as } \lambda \to 1. \tag{16}$$

To see why this is justified, note that because we can always achieve the same average queue 
length with a longer lookahead window, the optimal average queue length must be 
monotonically non-increasing in $W_\lambda$. Therefore, any lower bound we obtain on $Q^*(\lambda, W_\lambda)$ under 
the assumption of Eq. (16) also applies to the case where $W_\lambda = O(1)$. For simplicity of notation, 
we will drop the dependency on $W_\lambda$, and denote by $q_\lambda$ the optimal average queue length, 

$$q_\lambda = Q^*(\lambda, W_\lambda), \quad \forall \lambda \in (1-p, 1). \tag{17}$$

Main Assumption. We will assume the validity of the following hypothesis throughout the remainder 
of the proof, which states that it is indeed possible to achieve a small delay as long as $W_\lambda$ is of 
order $\Omega\left(\ln \frac{1}{1-\lambda}\right)$. As will be shown in Section 4.5, invalidating this hypothesis will imply the lower 
bound in Theorem 1. 

Hypothesis 1. Fix $p \in (0,1)$. Suppose that $W_\lambda \succeq \ln \frac{1}{1-\lambda}$, as $\lambda \to 1$. Then 

$$q_\lambda \ll \ln \frac{1}{1-\lambda}, \quad \text{as } \lambda \to 1. \tag{18}$$

Assuming the validity of Hypothesis 1, it also follows that if $W_\lambda \succeq \ln \frac{1}{1-\lambda}$, as $\lambda \to 1$, then 

$$q_\lambda \ll W_\lambda, \quad \text{as } \lambda \to 1. \tag{19}$$


4.1.1. State Representation and Stationary Policies. We show in this section that there 
always exists a stationary optimal policy that depends only on the state, which consists of the 
current queue length and content of the lookahead window. 

Since all diversion decisions are associated with events in $\mathcal{A} \cup \mathcal{S}$, it suffices to specify the nature 
of future information for the event times, $\{Z_n : n \in \mathbb{N}\}$. At $t = Z_n$, the content of the lookahead 
window is defined to be the vector $F(n) = (F_k(n) : k \in \mathbb{Z}_+)$, where 

$$F_k(n) = (Z_{n+k} - Z_n, R_{n+k}), \quad 0 \leq k \leq M(Z_n + W_\lambda) - M(Z_n). \tag{20}$$

In other words, $F_k(n)$ specifies the time of the $k$th future event starting from the current time, 
$Z_n$, along with its type, for all events within the lookahead window of length $W_\lambda$. For future events 
beyond the lookahead window, which we have no access to, we simply set the value of $F_k(n)$ to 
zero: 

$$F_k(n) = (0, 0), \quad k > M(Z_n + W_\lambda) - M(Z_n). \tag{21}$$

Recall that $Q(t)$ is the queue length at time $t$. Consider the sequence $\{X(n) : n \in \mathbb{N}\}$, where 

$$X(n) = (Q(Z_n-), F(n)). \tag{22}$$

From this point on, we will refer to $\{X(n) : n \in \mathbb{N}\}$ as the states of our system. 
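
The state construction can be illustrated with a small sketch. It makes two simplifying assumptions that are not part of the formal definition: the index $k$ starts at 0 (so the current event itself is the first entry), and entries beyond the window are omitted instead of being padded with $(0, 0)$. The function names `lookahead_content` and `state` are hypothetical.

```python
def lookahead_content(events, n, window):
    """Content of the lookahead window at the n-th event (n is 1-indexed).

    `events` is a time-ordered list of (Z, R) pairs. The result lists the pairs
    (Z_{n+k} - Z_n, R_{n+k}) for k = 0, 1, ... as long as Z_{n+k} <= Z_n + window.
    """
    z_n = events[n - 1][0]
    content = []
    for z, r in events[n - 1:]:
        if z > z_n + window:
            break  # every later event lies beyond the window as well
        content.append((z - z_n, r))
    return content


def state(queue_before_event, events, n, window):
    """State X(n): the queue length just before Z_n, paired with the window content."""
    return queue_before_event, lookahead_content(events, n, window)


if __name__ == "__main__":
    sample = [(1.0, 1), (1.5, -1), (2.5, 1), (4.0, 1)]
    print(state(0, sample, 1, 1.5))
```

With a window of length zero, the content reduces to the current event alone, which corresponds to the online problem. 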
Stationary Policies. A diversion policy $\pi$ is stationary, if its diversion decision at time $Z_n$ depends 
only on the state, $X(n)$, or formally, that 

$$\mathbb{P}(H(n) = 1 \mid X(n)) = \mathbb{P}(H(n) = 1 \mid X(1), \ldots, X(n), H(1), \ldots, H(n-1)), \quad \text{a.s.} \tag{23}$$

A stationary policy, $\pi$, is stable, if the evolution of $\{X(n) : n \in \mathbb{N}\}$ under $\pi$ admits a well-defined 
steady-state distribution, $\gamma$, so that steady-state queue length and probability of diversion coincide 
with the time-average queue length and diversion rate, respectively, given that the initial condition, 
$X(1)$, is distributed according to $\gamma$. 

In our admission control problem, because the arrivals and service tokens are generated according 
to Poisson processes, future evolution of the system starting from $t = Z_n$ is independent of the past, conditional 
on the current state $X(n)$ and diversion decision. As such, our problem can be cast as a discrete-time 
Markov decision process (MDP), with states $\{X(n) : n \in \mathbb{N}\}$ and actions that correspond to 
the probabilities of diversion. Using existing results in the literature (cf. Hernández-Lerma et al. 
(2003), González-Hernández and Villarreal (2011)), it can be shown that, for MDPs of this kind, 
there exists an optimal policy that is also stationary and stable. This is summarized in the following 
lemma, whose proof is given in Appendix A.1. 


Lemma 1. Fix any $p > 0$, $\lambda \in (1-p, 1)$, and $W_\lambda > 0$. The admission control problem admits a 
stable stationary optimal policy, $\pi$, which achieves the minimum time-average queue length among 
all feasible diversion policies. 


In light of Lemma 1, we will, in the remainder of the proof of Theorem 1, focus on the family of 
stable stationary policies, which we will refer to simply as stationary policies. Given a stationary 
policy, $\pi$, the resultant state sequence $\{X(n) : n \in \mathbb{N}\}$ is a stationary Markov chain. Since we are 
interested in deriving a performance lower bound, we may assume that, at time $t = 0$, both the 
queue length and the content of the lookahead window are initialized according to the steady-state 
distribution, $\gamma$. In particular, we have that 

$$\mathbb{E}(Q(t)) = \mathbb{E}(Q(0)) = Q(\pi, \lambda, W_\lambda), \quad \forall t \in \mathbb{R}_+, \tag{24}$$

and that 

$$\mathbb{E}(H(n)) = \mathbb{E}(H(1)) = \limsup_{N \to \infty} \frac{1}{N}\, \mathbb{E}\left( \sum_{i=1}^{N} H(i) \right), \quad \forall n \in \mathbb{N}. \tag{25}$$

Define the process $\{L(t) : t \in \mathbb{R}_+\}$, where 

$$L(t) = \mathbb{I}\{Q(t) < 2 q_\lambda\}, \quad t \in \mathbb{R}_+. \tag{26}$$

The following lemma will be useful. 

Lemma 2. Fix $p \in (0,1)$. For all $\lambda \in (1-p, 1)$, we have that 

$$\mathbb{E}(L(t)) = \mathbb{P}(Q(0) < 2 q_\lambda) \geq \frac{1}{2}, \quad \forall t \in \mathbb{R}_+, \tag{27}$$

under any optimal stationary policy. 

Proof. The result follows from the stationarity of $Q(\cdot)$ and Markov’s inequality: 

$$\mathbb{E}(L(t)) = \mathbb{P}(Q(t) < 2 q_\lambda) = \mathbb{P}(Q(0) < 2 q_\lambda) = \mathbb{P}(Q(0) < 2\,\mathbb{E}(Q(0))) \geq \frac{1}{2}. \tag{28}$$

Q.E.D. 
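
For completeness, the Markov step in the proof can be spelled out as follows, writing $q_\lambda = \mathbb{E}(Q(0))$ under an optimal stationary policy and assuming $q_\lambda > 0$:

```latex
\mathbb{P}\big(Q(0) \geq 2 q_\lambda\big) \leq \frac{\mathbb{E}(Q(0))}{2 q_\lambda} = \frac{1}{2}
\quad \Longrightarrow \quad
\mathbb{P}\big(Q(0) < 2 q_\lambda\big) = 1 - \mathbb{P}\big(Q(0) \geq 2 q_\lambda\big) \geq \frac{1}{2}.
```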

In the remainder of the proof, we will show that there exists $c_l > 0$ such that if $W_\lambda \lesssim c_l \ln \frac{1}{1-\lambda}$, 
then Eq. (27) cannot be true under any sequence of optimal stationary policies, unless $Q^*(\lambda, W_\lambda) \succeq 
\ln \frac{1}{1-\lambda}$. This would invalidate Hypothesis 1, which would in turn prove the lower bound on $Q^*(\lambda, W_\lambda)$ 
in Theorem 1. 

4.2. Base Sample Paths 

We now describe the construction of a set of base sample paths which will serve as the basis of 
our subsequent analysis. In later sections, we will show that, roughly speaking, the non-negligible 
chance of occurrence of such sample paths will “force” any feasible policy to be overly conservative 
in diverting jobs, should $W_\lambda$ be too small.


Figure 3. This figure illustrates the "macroscopic" behavior of the base sample paths. The dashed blue segment
between $W_\lambda$ and $W_\lambda + B$ represents a period of sustained upward drift of $S(0,\cdot)$, and the dotted red
segment starting at $2W_\lambda + B$ represents a downward drift. The two solid black segments, each with
length equal to that of the lookahead window, serve as a "buffer", ensuring that the actions of the
diversion policy before the segment are independent from the evolution of $S(0,\cdot)$ afterwards.
Let $B \in \mathbb{R}_+$ be a quantity whose value will be specified in the sequel. We define the following
time markers, whose positions relative to each other are illustrated in Figure 3:

$$U_1 = W_\lambda,$$
$$U_2 = U_1 + B = W_\lambda + B,$$
$$U_3 = U_2 + W_\lambda = 2W_\lambda + B.$$

The set of base sample paths is defined as the intersection of the events $\mathcal{E}_1$ through $\mathcal{E}_5$, described
as follows. Let $\epsilon$, $\zeta$ and $\phi$ be positive constants.

1. Event $\mathcal{E}_1$, parameterized by $\epsilon$ and $\zeta$, says that the sample path of $S(0,\cdot)$ stays close to its
expected behavior during the interval $(U_1, U_2]$:

$$\mathcal{E}_1 = \left\{ \left| S(U_1, t) - [\lambda - (1-p)]t \right| \le \epsilon t + \zeta, \text{ for all } t \in (U_1, U_2] \right\}. \tag{29}$$

When $\epsilon$ is small, this implies that $S(0,\cdot)$ undergoes a consistent upward drift during $(U_1, U_2]$.
Event $\mathcal{E}_1$ is illustrated by the dashed blue line segment in Figure 3.

2. Event $\mathcal{E}_2$ says that the queue length at $t = 0$ is not too large compared to the optimal average
queue length,

$$\mathcal{E}_2 = \{Q(0) < 6q_\lambda\}. \tag{30}$$

3. The events $\mathcal{E}_3$ and $\mathcal{E}_4$ put some restriction on the amount of upward excursion of $S(0,\cdot)$ during
the intervals $(0, U_1]$ and $(U_2, U_3]$, respectively,

$$\mathcal{E}_3 = \{S(0, U_1) < 2W_\lambda\}, \tag{31}$$

$$\mathcal{E}_4 = \{S(U_2, U_3) < 2W_\lambda\}. \tag{32}$$

The main purpose of $\mathcal{E}_3$ and $\mathcal{E}_4$ is to serve as "buffers" to induce a certain independence property,
which will be useful for subsequent analysis: since the lengths of $(0, U_1]$ and $(U_2, U_3]$ are both
equal to that of the lookahead window, the actions of the diversion policy before each interval
are independent from the evolution of $S(0,\cdot)$ after it. The two events are illustrated by the
black line segments in Figure 3.

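4. Finally, the event $\mathcal{E}_5$ says that $S(0,\cdot)$ will undergo a substantial downward excursion soon
after $U_3$, as is illustrated by the dotted red line segment in Figure 3. Let $Z$ be the stopping
time

$$Z = \inf\left\{ z \in \mathbb{R}_+ : S(U_3, U_3 + z) \le -\left[ 6q_\lambda + [\lambda - (1-p) - \epsilon]B + \zeta + 4W_\lambda \right] \right\}, \tag{33}$$

and $\mathcal{E}_5$ is defined by putting an upper bound on $Z$:

$$\mathcal{E}_5 = \{Z \le \phi W_\lambda\}. \tag{34}$$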
The right-hand side of the inequality in the definition of $Z$ was chosen so that, conditional on
the joint occurrence of $\mathcal{E}_1$ through $\mathcal{E}_4$, a downward excursion in $S(0,\cdot)$ of such magnitude is
guaranteed to deplete the queue by time $U_3 + Z$. As will become clearer in the next section,
this depletion will help us connect diversions to future idling of the server.

Note that the events $\mathcal{E}_1$, $\mathcal{E}_3$, $\mathcal{E}_4$ and $\mathcal{E}_5$ concern the input sample path $S(0,\cdot)$ only, and are
independent of the diversion policies, while $\mathcal{E}_2$ also depends on the choice of diversion policy.
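To make the time markers and the two "buffer" events concrete, the following sketch simulates an input sample path and evaluates $\mathcal{E}_3$ and $\mathcal{E}_4$ on it. All function names (`simulate_events`, `net_input`, `time_markers`, `buffer_events`) are hypothetical, and the path is generated by thinning a merged Poisson stream of rate $\lambda + 1 - p$, which is a standard construction assumed here for illustration.

```python
import random


def simulate_events(lam, p, horizon, rng):
    """Merged Poisson stream on (0, horizon]: list of (time, step) pairs.

    step = +1 marks an arrival (rate lam), step = -1 a service token (rate 1 - p).
    """
    total_rate = lam + 1.0 - p
    events, t = [], 0.0
    while True:
        t += rng.expovariate(total_rate)   # exponential gap of the merged stream
        if t > horizon:
            return events
        step = 1 if rng.random() < lam / total_rate else -1
        events.append((t, step))


def net_input(events, s, t):
    """S(s, t): arrivals minus service tokens in the interval (s, t]."""
    return sum(step for time, step in events if s < time <= t)


def time_markers(W, B):
    """Return (U1, U2, U3) for lookahead length W and drift-period length B."""
    U1 = W
    U2 = U1 + B
    U3 = U2 + W
    return U1, U2, U3


def buffer_events(events, W, B):
    """Evaluate E3 = {S(0,U1) < 2W} and E4 = {S(U2,U3) < 2W} on one path."""
    U1, U2, U3 = time_markers(W, B)
    return net_input(events, 0.0, U1) < 2 * W, net_input(events, U2, U3) < 2 * W
```

For example, `buffer_events(simulate_events(0.95, 0.1, 30.0, random.Random(0)), W=5.0, B=20.0)` returns a pair of booleans indicating whether each buffer event occurred on that simulated path.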

Having described the events that together characterize the base sample paths, we next illustrate
some of their statistical properties. The first lemma shows that the events $\mathcal{E}_1$ through $\mathcal{E}_4$ can occur
with fairly high probabilities. The proof is given in Appendix A.2.

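Lemma 3.

1. Fix $\epsilon > 0$. For all $\theta \in (0, 1)$, there exists $\zeta > 0$, so that for all $\lambda > 1 - \frac{1}{2}p$,

$$\inf_{B > 0} \mathbb{P}(\mathcal{E}_1) \ge \theta. \tag{35}$$

2. Under optimal stationary policies, $\mathbb{P}(\mathcal{E}_2) = \mathbb{P}(Q(0) < 6q_\lambda) \ge \frac{5}{6}$, for all $\lambda \in (1-p, 1)$.

3. $\lim_{\lambda \to 1} \mathbb{P}(\mathcal{E}_3) = \lim_{\lambda \to 1} \mathbb{P}(\mathcal{E}_4) = 1$.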

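The next lemma shows that the event $\mathcal{E}_5$ occurs with a small yet non-negligible probability. The
proof is given in Appendix A.3.

Lemma 4. Fix $k, \phi, \zeta > 0$, and $\epsilon \in (0, \min\{\zeta, \lambda - (1-p)\})$. Suppose that $B = kW_\lambda$, and $q_\lambda \ll W_\lambda$,
as $\lambda \to 1$. There exists $\gamma > 0$, such that

$$\mathbb{P}(\mathcal{E}_5) \succeq \exp(-\gamma W_\lambda), \quad \text{as } \lambda \to 1. \tag{36}$$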

Finally, the following independence properties among the events will be useful. The proof is given
in Appendix A.4.

Lemma 5. Fixing a feasible diversion policy, the following holds.

1. The events $\mathcal{E}_1$, $\mathcal{E}_3$, $\mathcal{E}_4$ and $\mathcal{E}_5$ are mutually independent.

2. The event $\mathcal{E}_2$ is independent of $\mathcal{E}_1$, $\mathcal{E}_4$ and $\mathcal{E}_5$, but not necessarily $\mathcal{E}_3$.

3. Denote by $Y$ the number of diversions in the interval $(U_1, U_2]$, i.e.,

$$Y = \sum_{N(U_1) + 1 \le n \le N(U_2)} H(n). \tag{37}$$

Then $Y$ is independent of $\mathcal{E}_5$.
4.3. From Diversions to Server Idling

The goal of this subsection is to show that, if $W_\lambda$ is small, then the number of diversions made
during the interval $(U_1, U_2]$, i.e., the random variable $Y$ (Eq. (37)), must also be appropriately
small, under any optimal stationary policy. To achieve this, we will exploit a connection between
$Y$ and the idling of the server at a later time.

The intuition is perhaps best seen pictorially, as depicted in Figure 3. Conditional on the occurrence
of the events $\mathcal{E}_1$ through $\mathcal{E}_5$, and supposing no diversion has been made, the queue length
process $Q(t)$ would have "followed" the trajectory depicted in the figure and reached zero by time
$U_3 + \phi W_\lambda$. Suppose now that a large number of diversions are made during the interval $(U_1, U_2]$
(dashed line segment in blue). The depletion of the queue then implies that there must be an extended
period of server idling prior to $U_3 + \phi W_\lambda$. Such idling, if it persists even as $\lambda \to 1$, can be
problematic and will be shown to contradict the feasibility of the diversion policy. This in turn implies
that the number of diversions in $(U_1, U_2]$ must be small.

The next proposition is the main result of this subsection, which formalizes the above intuition.
There is, however, one adjustment: as opposed to conditioning on all five events, which has
vanishingly small probability due to the presence of $\mathcal{E}_5$, we will condition only on $\mathcal{E}_1$ and $\mathcal{E}_2$, which
occur with high probability. To do so, we will exploit several independence properties among the
events, as in Lemma 5, and show that the impact of $S(0,\cdot)$'s downward excursion described by $\mathcal{E}_5$
is unavoidable when $W_\lambda$ is too small, even without explicitly conditioning on $\mathcal{E}_5$.


Proposition 1. Fix $k > 0$, and let $B = kW_\lambda$. There exists $c > 0$, so that if

$$W_\lambda \preceq c \ln\frac{1}{1-\lambda}, \quad \text{as } \lambda \to 1, \tag{38}$$

then for every $\tau > 0$,

$$\lim_{\lambda \to 1} \mathbb{P}(Y > \tau B \mid \mathcal{E}_1 \cap \mathcal{E}_2) = 0, \tag{39}$$

under any sequence of optimal stationary policies, where $Y$ is the number of diversions during
$(U_1, U_2]$, defined in Eq. (37).


Proof. We say that a service token generated at time $t$ is wasted, if there is currently no job
in the queue, i.e., $Q(t) = 0$. Let $\{J(t) : t \in \mathbb{R}_+\}$ be the counting process of wasted service tokens,
where

$$J(t) = \text{\# of wasted service tokens in } [0, t]. \tag{40}$$

For the sake of contradiction, assume the following is true: if $W_\lambda \preceq c \ln\frac{1}{1-\lambda}$ as $\lambda \to 1$, then there
exist $\tau > 0$, and a sequence of optimal stationary policies, $\{\pi_\lambda\}$, under which

$$\liminf_{\lambda \to 1} \mathbb{P}(Y > \tau B \mid \mathcal{E}_1 \cap \mathcal{E}_2) = q > 0. \tag{41}$$
The following lemma is a key ingredient to the proof, which says that the number of wasted tokens
must be substantial. The proof is based on the intuition explained in the passages above Proposition
1, and is given in Appendix A.5.

Lemma 6. Fix $k > 0$, and let $B = kW_\lambda$. Suppose Eq. (41) is true for some sequence of optimal
stationary policies, $\{\pi_\lambda\}$. Then there exist $a, \gamma > 0$ (whose values can depend on $k$) such that

$$\mathbb{E}(J(aW_\lambda)) \succeq W_\lambda \exp(-\gamma W_\lambda), \tag{42}$$

as $\lambda \to 1$, under $\{\pi_\lambda\}$.


Consider an optimal stationary policy. Denote by $\mathcal{H}(t)$ the counting process representing the
number of diversions in $[0, t]$, i.e.,

$$\mathcal{H}(t) = \sum_{n=1}^{N(t)} H(n). \tag{43}$$

By the stationarity of $\{H(n) : n \in \mathbb{N}\}$ (Eq. (25)) and the definition of $N(t)$ (Eq. (10)), it is not difficult
to show that, for all $t > 0$,

$$\frac{\mathbb{E}(\mathcal{H}(t))}{t} = (\lambda + 1 - p)\,\mathbb{E}(H(1)) = \limsup_{N \to \infty} \frac{(\lambda + 1 - p)\,\mathbb{E}\left(\sum_{n=1}^{N} H(n)\right)}{N}. \tag{44}$$


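By definition, we have that

$$Q(t) = Q(0) + S(0, t) + J(t) - \mathcal{H}(t), \quad \forall t \ge 0. \tag{45}$$

Taking expectation on both sides of the above equation, and letting $t = aW_\lambda$, where $a$ is given as
in Lemma 6, we have that

$$\begin{aligned}
\frac{\mathbb{E}(\mathcal{H}(aW_\lambda))}{aW_\lambda} - p
&= \frac{1}{aW_\lambda}\Big( \mathbb{E}(S(0, aW_\lambda)) + \mathbb{E}(J(aW_\lambda)) + \mathbb{E}(Q(0)) - \mathbb{E}(Q(aW_\lambda)) \Big) - p \\
&\stackrel{(a)}{=} [\lambda - (1-p)] - p + \frac{1}{aW_\lambda}\mathbb{E}(J(aW_\lambda)) \\
&\stackrel{(b)}{\succeq} (\lambda - 1) + \frac{1}{aW_\lambda} W_\lambda \exp(-\gamma W_\lambda) \\
&\succeq \exp(-\gamma W_\lambda) - (1 - \lambda),
\end{aligned} \tag{46}$$

where $\gamma$ is given in Lemma 6. Step (a) follows from the fact that $\mathbb{E}(Q(0)) = \mathbb{E}(Q(aW_\lambda))$ by the
stationarity of $Q(\cdot)$, and (b) from Eq. (42).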
Letting $W_\lambda = c \ln\frac{1}{1-\lambda}$, with $c = 1/(2\gamma)$, we have that

$$\exp(-\gamma W_\lambda) = \sqrt{1 - \lambda} \gg 1 - \lambda, \quad \text{as } \lambda \to 1. \tag{47}$$

Combining Eqs. (46) and (47), we have that

$$\frac{\mathbb{E}(\mathcal{H}(aW_\lambda))}{aW_\lambda} - p \succeq \sqrt{1 - \lambda}, \quad \text{as } \lambda \to 1. \tag{48}$$

In particular, this implies that there exists $\lambda' \in (1-p, 1)$, such that

$$\frac{\mathbb{E}(\mathcal{H}(aW_\lambda))}{aW_\lambda} > p, \quad \forall \lambda \in (\lambda', 1). \tag{49}$$

Since the stationary diversion policies we consider are feasible, we must have that

$$\frac{\mathbb{E}(\mathcal{H}(t))}{t} \stackrel{(a)}{=} \limsup_{N \to \infty} \frac{(\lambda + 1 - p)\,\mathbb{E}\left(\sum_{n=1}^{N} H(n)\right)}{N} \stackrel{(b)}{\le} p, \tag{50}$$

for all $\lambda \in (1-p, 1)$, and $t > 0$, where (a) and (b) follow from Eqs. (44) and (14), respectively. This
leads to a contradiction with Eq. (49), which invalidates the assumption made in Eq. (41), and
hence proves Proposition 1. Q.E.D.
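The sample-path identity in Eq. (45), which relates the queue length to the net input, the wasted service tokens and the diversions, can be checked on a simulated path. The sketch below uses a simple online threshold rule as a stand-in diversion policy; this rule, the function name `run_threshold_queue`, and the parameter values are assumptions for illustration only, and no lookahead is modeled.

```python
import random


def run_threshold_queue(steps, threshold, q0=0):
    """Run a single queue over a sequence of +1 (arrival) / -1 (service token) steps.

    Arrivals are diverted whenever the queue length is at or above `threshold`
    (an illustrative rule, not the optimal policy). Returns the final values of
    (Q, S, J, H): queue length, net input, wasted tokens, and diversions.
    """
    q, net_input, wasted, diverted = q0, 0, 0, 0
    for step in steps:
        net_input += step
        if step == 1:
            if q >= threshold:
                diverted += 1      # arrival is diverted, queue unchanged
            else:
                q += 1             # arrival is admitted
        else:
            if q == 0:
                wasted += 1        # token finds an empty queue and is wasted
            else:
                q -= 1             # token serves one job
    return q, net_input, wasted, diverted


rng = random.Random(1)
steps = [1 if rng.random() < 0.55 else -1 for _ in range(10_000)]
q, s, j, h = run_threshold_queue(steps, threshold=5, q0=2)
assert q == 2 + s + j - h          # Q(t) = Q(0) + S(0,t) + J(t) - H(t)
```

The identity holds path by path because every event changes both sides by the same amount: a diversion cancels an arrival, and a wasted token cancels a service.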


4.4. Consequences of Too Few Diversions 

Proposition 1 tells us that, under optimal stationary policies, the number of diversions in $(U_1, U_2]$
must be small when $W_\lambda$ is small. Building on this observation, we now focus on policies that
divert "very few" jobs during $(U_1, U_2]$, i.e., with $Y$ scaling sub-linearly with respect to $B$, and
show that they will necessarily lead to a large expected queue length in steady-state. The following
proposition is the main result of this subsection.

Proposition 2. Fix $p \in (0,1)$. There exists $c_l > 0$, so that if

$$W_\lambda \preceq c_l \ln\frac{1}{1-\lambda}, \quad \text{as } \lambda \to 1, \tag{51}$$

then

$$\limsup_{\lambda \to 1} \mathbb{E}(L(0)) \le \frac{1}{3}, \tag{52}$$

under any sequence of optimal stationary policies.

Proof. We will assume that $B = kW_\lambda$, with $k = 24$, and that $W_\lambda \preceq c_l \ln\frac{1}{1-\lambda}$, where $c_l$ is equal to
the constant $c$ in Proposition 1 for the corresponding value of $k$.

Consider an optimal stationary policy, with a resultant average queue length of $q_\lambda$. We will prove
the claim by showing that if $\mathcal{E}_1 \cap \mathcal{E}_2$ occurs and the number of diversions made in $(U_1, U_2]$ is small
(cf. Eq. (39)), then, for a "long time" after $U_1$, the queue length will stay at a high level (i.e.,
$Q(t) \ge 2q_\lambda$). Recall that $Y$ is the number of diversions made during the period $(U_1, U_2]$. We have
the following inequality, derived from the queueing dynamics:

$$Q(t) \ge Q(U_1) + S(U_1, t) - Y, \quad \forall t \in (U_1, U_2]. \tag{53}$$

By the definition of $\mathcal{E}_1$ (Eq. (29)), Eq. (53), and the fact that $Q(U_1) \ge 0$, we have that

$$\mathbb{P}\big( Q(t) \ge [\lambda - (1-p) - \epsilon]t - \zeta - Y \,\big|\, \mathcal{E}_1 \cap \mathcal{E}_2 \big) = 1, \quad \forall t \in (U_1, U_2]. \tag{54}$$

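Let $V$ be the last time in $(U_1, U_2]$ when the queue length becomes less than $2q_\lambda$, with

$$V = \sup\{t \in [0, B) : Q(U_1 + t) < 2q_\lambda\}, \quad \text{if } \inf_{t \in [0, B)} Q(U_1 + t) < 2q_\lambda, \tag{55}$$

and $V = 0$, otherwise. Applying the definition of $V$ in the context of Eq. (54) yields that

$$\mathbb{P}\left( V \le \frac{2q_\lambda + Y + \zeta + 1}{\lambda - (1-p) - \epsilon} \,\middle|\, \mathcal{E}_1 \cap \mathcal{E}_2 \right) = 1. \tag{56}$$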

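Recall from Proposition 1 that, conditional on $\mathcal{E}_1 \cap \mathcal{E}_2$ and assuming $W_\lambda \preceq c_l \ln\frac{1}{1-\lambda}$, $Y$ must be
sub-linear in $B = kW_\lambda$. In particular, by Eq. (39), we have that, for all $\tau > 0$,

$$\lim_{\lambda \to 1} \mathbb{P}(Y \le \tau k W_\lambda \mid \mathcal{E}_1 \cap \mathcal{E}_2) = 1. \tag{57}$$

Combining Eqs. (56) and (57), and the fact that $W_\lambda \to \infty$ as $\lambda \to 1$, we have that there exists
$v > 0$, such that for all $\tau > 0$,

$$\mathbb{P}(V \le v q_\lambda + \tau k W_\lambda \mid \mathcal{E}_1 \cap \mathcal{E}_2) = 1 - \delta(\lambda), \quad \forall \lambda \in (1-p, 1), \tag{58}$$

where $\delta(\cdot)$ is a function with $\lim_{x \to 1} \delta(x) = 0$. In other words, conditional on $\mathcal{E}_1 \cap \mathcal{E}_2$, $Q(t)$ will reach
the level of $2q_\lambda$ soon after $U_1$, with high probability. Using the fact that $V \le U_2$, Eq. (58) further
implies that

$$\mathbb{E}(V \mid \mathcal{E}_1 \cap \mathcal{E}_2) \le (v q_\lambda + \tau k W_\lambda)(1 - \delta(\lambda)) + U_2 \delta(\lambda) \le v q_\lambda + \tau k W_\lambda + U_2 \delta(\lambda). \tag{59}$$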


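Translating this into the value of $\mathbb{E}(V)$, we have that

$$\begin{aligned}
\limsup_{\lambda \to 1} \frac{\mathbb{E}(V)}{U_2}
&\le \limsup_{\lambda \to 1} \frac{1}{U_2}\Big( \mathbb{E}(V \mid \mathcal{E}_1 \cap \mathcal{E}_2)\,\mathbb{P}(\mathcal{E}_1 \cap \mathcal{E}_2) + U_2\big(1 - \mathbb{P}(\mathcal{E}_1 \cap \mathcal{E}_2)\big) \Big) \\
&\stackrel{(a)}{\le} \limsup_{\lambda \to 1} \left( 1 - \mathbb{P}(\mathcal{E}_1 \cap \mathcal{E}_2) + \frac{v q_\lambda + \tau k W_\lambda + U_2 \delta(\lambda)}{U_2}\,\mathbb{P}(\mathcal{E}_1 \cap \mathcal{E}_2) \right) \\
&\stackrel{(b)}{=} \limsup_{\lambda \to 1} \left( 1 - \mathbb{P}(\mathcal{E}_1 \cap \mathcal{E}_2) + \frac{k W_\lambda}{U_2}\,\tau\,\mathbb{P}(\mathcal{E}_1 \cap \mathcal{E}_2) \right) \\
&\stackrel{(c)}{=} \limsup_{\lambda \to 1} \left( 1 - \mathbb{P}(\mathcal{E}_1 \cap \mathcal{E}_2) + \frac{k}{k+1}\,\tau\,\mathbb{P}(\mathcal{E}_1 \cap \mathcal{E}_2) \right) \\
&\le \tau + \limsup_{\lambda \to 1} \big(1 - \mathbb{P}(\mathcal{E}_1 \cap \mathcal{E}_2)\big),
\end{aligned} \tag{60}$$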
where step (a) follows from Eq. (59), (b) from the assumptions that $q_\lambda \ll W_\lambda$ and $\lim_{\lambda \to 1} \delta(\lambda) = 0$,
and (c) from the fact that $U_2 = B + W_\lambda = (k+1)W_\lambda$. We now connect the behavior of $\mathbb{E}(V)$ to that
of $\mathbb{E}(L(0)) = \mathbb{P}(Q(0) < 2q_\lambda)$, as follows. Fixing any $\lambda \in (1-p, 1)$, we have that

$$\mathbb{E}(L(0)) \stackrel{(a)}{=} \frac{1}{U_2}\,\mathbb{E}\left( \int_0^{U_2} L(t)\,dt \right) \stackrel{(b)}{=} \frac{1}{U_2}\,\mathbb{E}\left( \int_0^{U_1 + V} L(t)\,dt \right) \stackrel{(c)}{\le} \frac{U_1 + \mathbb{E}(V)}{U_2}, \tag{61}$$

where step (a) follows from the stationarity of the process $L(\cdot)$, which in turn follows from the
stationarity of $Q(\cdot)$. Step (b) follows from the fact that $L(t) = 0$, for all $t \in [U_1 + V, U_2]$, which is a
consequence of the definition of $V$ in Eq. (55). Step (c) is based on the fact that $L(t) \le 1$, a.s. By
Eq. (61), we have that


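$$\begin{aligned}
\limsup_{\lambda \to 1} \mathbb{E}(L(0))
&\le \limsup_{\lambda \to 1} \frac{U_1 + \mathbb{E}(V)}{U_2} \\
&\stackrel{(a)}{=} \frac{W_\lambda}{(k+1)W_\lambda} + \limsup_{\lambda \to 1} \frac{\mathbb{E}(V)}{U_2} \\
&\stackrel{(b)}{\le} \frac{1}{k} + \tau + \limsup_{\lambda \to 1} \big(1 - \mathbb{P}(\mathcal{E}_1 \cap \mathcal{E}_2)\big),
\end{aligned} \tag{62}$$

where steps (a) and (b) follow from the fact that $B = kW_\lambda$, and Eq. (60), respectively.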
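By Claims 1 and 2 of Lemma 3 and Claim 2 of Lemma 5, we have that

$$\liminf_{\lambda \to 1} \mathbb{P}(\mathcal{E}_1 \cap \mathcal{E}_2) = \liminf_{\lambda \to 1} \mathbb{P}(\mathcal{E}_1)\,\mathbb{P}(\mathcal{E}_2) \ge \frac{5}{6}\theta, \tag{63}$$

where $\theta$ is given in Eq. (35). Set $\tau = 1/k = 1/24$, and let $\zeta$ be sufficiently large so that $\theta \ge 9/10$. We
have that

$$\limsup_{\lambda \to 1} \big(1 - \mathbb{P}(\mathcal{E}_1 \cap \mathcal{E}_2)\big) \le 1 - \frac{5}{6} \cdot \frac{9}{10} = \frac{1}{4}. \tag{64}$$

From Eq. (62), we have that

$$\limsup_{\lambda \to 1} \mathbb{E}(L(0)) \le \frac{1}{k} + \tau + \limsup_{\lambda \to 1} \big(1 - \mathbb{P}(\mathcal{E}_1 \cap \mathcal{E}_2)\big) \le \frac{1}{24} + \frac{1}{24} + \frac{1}{4} = \frac{1}{3}, \tag{65}$$

which completes the proof of Proposition 2. Q.E.D.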
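The constants at the end of this proof can be checked with exact arithmetic. The sketch below assumes the reading $k = 24$, $\tau = 1/24$, $\theta = 9/10$ and $\mathbb{P}(\mathcal{E}_2) \ge 5/6$ for the quantities in Eqs. (62) through (65); the function name `limsup_bound` is hypothetical.

```python
from fractions import Fraction


def limsup_bound(k, tau, theta, p_e2=Fraction(5, 6)):
    """Evaluate 1/k + tau + (1 - P(E2) * theta), the right-hand side of Eq. (62).

    The default p_e2 = 5/6 is the Markov-inequality bound on P(E2) from Lemma 3;
    all argument values are assumptions read off the proof.
    """
    return Fraction(1, k) + tau + (1 - p_e2 * theta)


bound = limsup_bound(k=24, tau=Fraction(1, 24), theta=Fraction(9, 10))
assert bound == Fraction(1, 3)            # strictly below the 1/2 required by Lemma 2
```

Any choice of constants that brings this bound below $1/2$ would serve the same purpose in the contradiction argument; the values above simply give the round figure $1/3$.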








Xu: Necessity of Future Information in Admission Control 




4.5. Proof of Theorem 1

We now complete the proof of Theorem 1. Assuming the validity of Hypothesis 1, Proposition 2
asserts that there exists $c_l > 0$, so that if $W_\lambda \preceq c_l \ln\frac{1}{1-\lambda}$ as $\lambda \to 1$, we must have that
$\limsup_{\lambda \to 1} \mathbb{E}(L(0)) \le 1/3$ under any sequence of optimal stationary policies. However, this
contradicts the requirement that $\mathbb{E}(L(0)) \ge 1/2$, given in Eq. (27), which holds independently of the
validity of Hypothesis 1. Therefore, we conclude that Hypothesis 1 must be invalid.

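The invalidity of Hypothesis 1 establishes the lower bound in Eq. (7), as follows. The negation
of the statement of Hypothesis 1 directly implies that there exists $c_l > 0$, so that if $W_\lambda \preceq c_l \ln\frac{1}{1-\lambda}$,
as $\lambda \to 1$, then, for any sequence $\{\lambda_n\}$ in $(1-p, 1)$, with $\lim_{n \to \infty} \lambda_n = 1$, we have that

$$\limsup_{n \to \infty} \frac{Q^*(\lambda_n, W_{\lambda_n})}{\ln\frac{1}{1-\lambda_n}} > 0. \tag{66}$$

We can further strengthen Eq. (66), and claim that, for any such sequence, we also have that

$$\liminf_{n \to \infty} \frac{Q^*(\lambda_n, W_{\lambda_n})}{\ln\frac{1}{1-\lambda_n}} > 0. \tag{67}$$

To show Eq. (67), suppose, for the sake of contradiction, that

$$\liminf_{n \to \infty} \frac{Q^*(\lambda_n, W_{\lambda_n})}{\ln\frac{1}{1-\lambda_n}} = 0,$$

for some sequence $\{\lambda_n\}$. This implies that $\{\lambda_n\}$ admits a subsequence, $\{\lambda_{n_i}\}$, such that
$\limsup_{i \to \infty} Q^*(\lambda_{n_i}, W_{\lambda_{n_i}}) / \ln\frac{1}{1-\lambda_{n_i}} = 0$. The existence of the sequence $\{\lambda_{n_i}\}$ contradicts Eq. (66). This proves
Eq. (67), which in turn establishes the lower bound in Eq. (7), i.e., that if $W_\lambda \preceq c_l \ln\frac{1}{1-\lambda}$ as $\lambda \to 1$,
then

$$Q^*(\lambda, W_\lambda) \succeq \ln\frac{1}{1-\lambda}, \quad \text{as } \lambda \to 1. \tag{68}$$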


Finally, we show that the lower bound in Eq. (68) is achievable, i.e., that

$$Q^*(\lambda, W_\lambda) \preceq \ln\frac{1}{1-\lambda}, \quad \text{as } \lambda \to 1, \tag{69}$$

when $W_\lambda \preceq c_l \ln\frac{1}{1-\lambda}$. To this end, we invoke Theorem 7 in Spencer et al. (2014), which shows that
a deterministic queue-length-based diversion policy can achieve the scaling of Eq. (69), even when
$W_\lambda = 0$.⁸ This completes the proof of Theorem 1. Q.E.D.


⁸ As is described in Spencer et al. (2014), the scaling in Eq. (69) can be achieved by the following simple threshold
policy: divert the arrival if and only if the current queue length is equal to a threshold value $x$, where $x$ is set to be the
smallest value such that the resultant rate of diversion is no more than $p$. Since the queue length process under this
policy is simply a birth-death process truncated at state $x$, it is easy to verify, via a direct calculation of steady-state
probabilities of $Q(t)$, that $q_\lambda \sim \ln\frac{1}{1-\lambda}$ as $\lambda \to 1$.
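The direct calculation mentioned for the threshold policy can be sketched numerically. The code below assumes the truncated birth-death chain has birth rate $\lambda$ in states below $x$ and death rate $1-p$, so that its stationary weights are proportional to $\rho^k$ with $\rho = \lambda/(1-p)$, and that the diversion rate equals $\lambda$ times the stationary probability of state $x$. The function name `threshold_policy` is hypothetical.

```python
import math


def threshold_policy(lam, p):
    """Smallest threshold x whose diversion rate is at most p, and the mean queue length.

    Assumes a birth-death chain on {0, ..., x} with birth rate lam and death
    rate 1 - p, so the stationary probability of state k is proportional to rho**k.
    """
    rho = lam / (1.0 - p)
    x = 0
    while True:
        weights = [rho ** k for k in range(x + 1)]
        total = sum(weights)
        pi_x = weights[-1] / total            # stationary probability of state x
        if lam * pi_x <= p:                   # diversion rate = lam * P(Q = x)
            mean_q = sum(k * w for k, w in enumerate(weights)) / total
            return x, mean_q
        x += 1


for lam in (0.9, 0.99, 0.999):
    x, mean_q = threshold_policy(lam, p=0.5)
    # Compare the mean queue length with the ln(1/(1-lam)) scale.
    print(lam, x, round(mean_q, 3), round(mean_q / math.log(1.0 / (1.0 - lam)), 3))
```

Under these assumptions the threshold grows by a roughly constant amount each time $1 - \lambda$ shrinks by a factor of ten, which is the logarithmic growth the footnote describes.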
5. Conclusions and Future Work

In the context of a class of queueing admission control problems, we showed that a non-trivial
amount of future information is necessary in order to achieve superior heavy-traffic delay
performance compared to an online policy. Theorem 1 also resolves a conjecture posed by Spencer et al.
(2014). Our proof exploited certain excursion properties of a transient random walk, which allowed
us to connect a policy's diversion decisions to subsequent system idling.

There are several interesting avenues of future research. First, in light of Theorem 1 and the
results of Spencer et al. (2014), an immediate question is whether the constants $c_h$ and
$c_l$ in the scaling of $W_\lambda$ coincide. The granularity of our proof technique does not appear to be
sufficient to answer this question, which likely demands a finer analysis.

Because our proof relies mostly on the macroscopic properties of the input sample paths, the 
techniques and resultant insights in this paper seem to be fairly robust and can potentially be 
generalized to derive lower bounds on the necessary amount of future information for other resource 
allocation problems. For example, one generalization could be for a setting where the arrival and 
service token processes are non-Poisson (e.g., renewal or phase-type processes). In this case, we 
expect similar arguments to work when the process, $S(0,\cdot)$, admits similar excursion properties
as in the case of Poisson processes, and does not exhibit substantial long-range correlations (for 
otherwise, one could potentially obtain more future information by looking into the history of 
past inputs). Another possibility would be to consider systems with multiple queues, in which 
case the relevant excursion properties of the input processes would likely be connected to those of 
random walks in higher dimensions. Yet another variation would be to relax the hard diversion rate 
constraint, and consider instead the scenario where the system manager is interested in minimizing 
some combined cost as a function of the delay and diversion rate. However, depending on the cost 
function, one may need to adjust the performance metric or regime of interest, since the system 
may not ever have to become critically loaded, simply because the cost structure would encourage 
a higher rate of diversion as the system load increases. 

Finally, at a higher level, while our result focuses on the quantity of future information,
measured by the length of a lookahead window, there is another important dimension of quality. For
instance, the observed future input may differ from the actual realizations due to prediction noise,
or alternatively, only distributional information of future input is available. Neither our results, nor
those of Spencer et al. (2014), deal with the impact of prediction noise, and Xu and Chan (2014)
considers only a specific noise model induced by random no-shows. A rigorous understanding of
the impact of prediction accuracy in the context of dynamic resource allocation problems could be
a promising direction for future research.
Appendix


A. Additional Proofs
A.1. Proof of Lemma 1

Proof. We will formulate our admission control problem as a discrete-time Markov decision process
(MDP), and invoke existing results to verify the existence of a stable stationary optimal policy.
Recall that the state of the system at the $n$th step is $X_n = (Q(Z_n-), F_n)$, where $F_n$ was defined in
Eq. (20). Define $\mathcal{X}$ as the set

$$\mathcal{X} = \mathbb{Z}_+ \times [-1, W_\lambda]^{\mathbb{N}}. \tag{70}$$

Note that $X_n$ can be represented as an element in $\mathcal{X}$ for all $n$, because $Q(Z_n-)$ is the queue
length just before the $n$th event and hence belongs to $\mathbb{Z}_+$, and each coordinate of $F_n$, which either
represents the type of an event or an inter-arrival time upper-bounded by $W_\lambda$, lies in the interval
$[-1, W_\lambda]$. The following topological properties of $\mathcal{X}$ are useful, whose proof is given in Appendix
A.6.

Lemma 7. The following holds.

1. $\mathcal{X}$ is Polish, i.e., it is complete and separable.

2. Under an appropriate metric, the set $\{x \in \mathcal{X} : x_1 \le a\}$ is compact for all $a \in \mathbb{R}_+$.


The MDP associated with our admission control problem is defined as follows:

1. The state space is $\mathcal{X}$, defined in Eq. (70).

2. The action space, $\mathcal{L}$, is the closed interval $[0,1]$, and the action at step $n$, $l_n \in \mathcal{L}$, specifies the
probability of diversion, i.e., $l_n = \mathbb{P}(H(n) = 1)$. Denote by $\mathcal{L}(X)$ the set of allowable actions
when the system is in state $X$. Then $\mathcal{L}(X_n) = [0,1]$ if $A_n(0) = 1$, which corresponds to the
$n$th event being an arrival, and $\mathcal{L}(X_n) = \{0\}$ if $S_n(0) = 1$, which corresponds to the $n$th event
being the generation of a service token.

3. The stochastic kernel is the one associated with the Poisson arrival and service token processes,
as well as the queueing and diversion dynamics.

4. The $n$th step is associated with a penalty, $f(X_n, l_n)$, which is equal to the queue length,
$Q(Z_n-)$. It also incurs a cost, $c(X_n, l_n)$, which is equal to the probability of diversion, $l_n$.

5. The objective is to minimize the time-average penalty, defined in Eq. (15), subject to a
constraint on the time-average cost, defined in Eq. (14).


Theorem 3.2 and Lemma 3.5 of Hernandez-Lerma et al. (2003) show that an MDP of this kind
admits a stable stationary optimal policy, provided that a set of conditions are satisfied, which
are given in Section 2 and Assumption 3.1 of Hernandez-Lerma et al. (2003). These conditions are
met by our MDP, and we highlight a few among them: (1) the state space is Polish (by the first
claim of Lemma 7), (2) the set $\{(X, l) \in (\mathcal{X}, \mathcal{L}) : f(X, l) \le a\}$ is compact for all $a \in \mathbb{R}_+$ (by the
second claim of Lemma 7), (3) $c(X_n, l_n)$, which in our case is simply equal to $l_n$, is non-negative and
lower semi-continuous in $l_n$ for every state $X_n \in \mathcal{X}$, and (4) the stochastic kernel satisfies a certain
weak continuity condition, which essentially requires that the distribution of $X_{n+1}$ not vary abruptly as a
function of the state-action pair $(X_n, l_n)$; this continuity condition can be verified by using the
definitions of Poisson processes and the associated queueing dynamics. This completes the proof
of Lemma 1. Q.E.D.

A.2. Proof of Lemma 3

Proof. Recall that $S(s,t)$ is defined as the difference between the numbers of arrivals
and service tokens in $(s,t]$. Since the arrival and service token processes are independent Poisson
processes with rates $\lambda$ and $1-p$, respectively, it is not difficult to verify that

$$S(s,t) = \sum_{n=1}^{N_{s,t}} X_n, \tag{71}$$

where $N_{s,t}$ is a Poisson random variable with mean $(\lambda + 1 - p)(t - s)$, which corresponds to the
total number of events in $(s,t]$, and the $X_n$'s are i.i.d., with

$$X_1 = \begin{cases} 1, & \text{w.p. } \frac{\lambda}{\lambda + 1 - p}, \\ -1, & \text{otherwise.} \end{cases} \tag{72}$$

By Eq. (71), and the fact that $\lim_{B \to \infty} N_{U_1, U_2}/B = \lambda + 1 - p$ almost surely, Claim 1 follows from
a variation of the standard Functional Law of Large Numbers (FLLN) for the sum of bounded
i.i.d. random variables. Claim 3 follows from the Weak Law of Large Numbers applied to the sum
of i.i.d. Poisson random variables, and our assumption that $W_\lambda \to \infty$ as $\lambda \to 1$ (Eq. (16)). Finally,
Claim 2 follows from Markov's inequality, in the same way as in Eq. (27), by noting that
$\mathbb{E}(Q(0)) = q_\lambda$ under an optimal stationary policy. Q.E.D.


A.3. Proof of Lemma 4

Proof. Based on the stationarity of $A$ and $S$, and the assumption that $B = kW_\lambda$ and $q_\lambda \ll W_\lambda$, it
suffices for us to show that, for any $a, b > 0$, there exists $\gamma > 0$, such that

$$\mathbb{P}(S(0, aW_\lambda) \le -bW_\lambda) \succeq \exp(-\gamma W_\lambda), \quad \text{as } \lambda \to 1. \tag{73}$$

By definition, the distribution of $S(0,t)$ can be written as

$$S(0,t) = A_{\lambda t} - D_{(1-p)t}, \tag{74}$$

where $A_{\lambda t}$ and $D_{(1-p)t}$ are independent Poisson random variables with mean $\lambda t$ and $(1-p)t$,
respectively. The following lemma follows from the standard large-deviation principles of Poisson
random variables, and its proof is omitted.
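Lemma 8. Let $D_x$ be a Poisson random variable with mean $x$. Then, for all $c_1 > 0$, there exists
$c_2 > 0$, such that

$$\mathbb{P}(D_x \ge c_1 x) \succeq \exp(-c_2 x), \quad \text{as } x \to \infty. \tag{75}$$

Combining Lemma 8 and the fact that $W_\lambda \to \infty$ as $\lambda \to 1$, we have that there exists $\gamma > 0$, such
that

$$\mathbb{P}\big( D_{(1-p)aW_\lambda} \ge (b + 2a)W_\lambda \big) \succeq \exp(-\gamma W_\lambda), \tag{76}$$

as $\lambda \to 1$. We have that

$$\begin{aligned}
\mathbb{P}(S(0, aW_\lambda) \le -bW_\lambda)
&\ge \mathbb{P}\big( \{A_{\lambda a W_\lambda} \le 2aW_\lambda\} \cap \{D_{(1-p)aW_\lambda} \ge (b + 2a)W_\lambda\} \big) \\
&\stackrel{(a)}{=} \mathbb{P}(A_{\lambda a W_\lambda} \le 2aW_\lambda)\,\mathbb{P}\big( D_{(1-p)aW_\lambda} \ge (b + 2a)W_\lambda \big) \\
&\stackrel{(b)}{\ge} \mathbb{P}(A_{\lambda a W_\lambda} \le 2\lambda a W_\lambda)\,\mathbb{P}\big( D_{(1-p)aW_\lambda} \ge (b + 2a)W_\lambda \big) \\
&\stackrel{(c)}{\ge} \frac{1}{2}\,\mathbb{P}\big( D_{(1-p)aW_\lambda} \ge (b + 2a)W_\lambda \big) \\
&\stackrel{(d)}{\succeq} \exp(-\gamma W_\lambda),
\end{aligned} \tag{77}$$

as $\lambda \to 1$, where step (a) follows from the independence between $A_{\lambda a W_\lambda}$ and $D_{(1-p)aW_\lambda}$, (b) from
the fact that $\lambda < 1$, (c) from Markov's inequality, and (d) from Eq. (76). This proves Eq. (73),
and hence Lemma 4. Q.E.D.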
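The exponential decay in Lemma 8 can be examined numerically. The sketch below computes $\ln \mathbb{P}(D_x \ge c_1 x)$ in log-space and compares the decay exponent $-\frac{1}{x}\ln \mathbb{P}(D_x \ge c_1 x)$ with the classical Poisson large-deviation rate $c_1 \ln c_1 - c_1 + 1$ for $c_1 > 1$. The function names are hypothetical, and the explicit rate function is a standard fact brought in for comparison rather than something stated in the lemma.

```python
import math


def log_poisson_tail(mean, threshold):
    """ln P(D >= threshold) for D ~ Poisson(mean), with integer threshold >= 0.

    Sums the pmf in log-space starting at `threshold`, stopping once the terms
    are negligible relative to the largest one seen so far.
    """
    log_terms = []
    k = threshold
    while True:
        log_pmf = -mean + k * math.log(mean) - math.lgamma(k + 1)
        log_terms.append(log_pmf)
        # Past the mode the pmf decays at least geometrically, so this terminates.
        if k > mean and log_pmf < max(log_terms) - 60.0:
            break
        k += 1
    top = max(log_terms)
    return top + math.log(sum(math.exp(t - top) for t in log_terms))


def decay_exponent(x, c1):
    """-(1/x) * ln P(D_x >= c1 * x); a candidate value for c2 at this x."""
    return -log_poisson_tail(x, math.ceil(c1 * x)) / x


rate = 2.0 * math.log(2.0) - 2.0 + 1.0      # c1*ln(c1) - c1 + 1 at c1 = 2
for x in (200.0, 2000.0):
    print(x, round(decay_exponent(x, 2.0), 4), round(rate, 4))
```

As $x$ grows the computed exponent approaches the rate function from above, which is consistent with a lower bound of the form $\exp(-c_2 x)$ for any $c_2$ slightly larger than that rate.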

A.4. Proof of Lemma 5

Proof. For Claim 1, observe that each of the events concerns only the behavior of the arrival and
service token processes over an interval, and that these intervals are disjoint from each other. Claim
1 follows by noting that both $A$ and $S$ are Poisson processes and hence memoryless. For Claim
2, because the policy has access to a lookahead window of length $W_\lambda$, the queue length at time $t$
is $\mathcal{F}_{t+W_\lambda}$-measurable, where $\mathcal{F}$ is the natural filtration induced by the input processes. The
claim follows again from the memoryless property of Poisson processes. Claim 3 follows from the
same arguments as for Claim 2. Q.E.D.

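A.5. Proof of Lemma 6

Proof. Consider the sequence of optimal stationary policies, $\{\pi_\lambda\}$. Let $\phi$ be defined as in Eq. (34).
Fix $\phi > 0$, and let

$$K = U_3 + \phi W_\lambda \stackrel{(a)}{=} (k + \phi + 2)W_\lambda, \tag{78}$$

where step (a) follows from the fact that $U_3 = B + 2W_\lambda$ and $B = kW_\lambda$. The main idea for the
proof is based on the following observation: conditional on $\cap_{i=1}^{5} \mathcal{E}_i$, the queue length process, $Q(t)$,
would have reached zero before time $K$, even if no diversion had been made in $(0, K]$ (illustrated in
Figure 3). Therefore, each diversion made in $(U_1, U_2]$ will necessarily lead to a wasted service token
in $(0, K]$, and hence

$$\mathbb{P}\big( J(K) > \tau B \,\big|\, \cap_{i=1}^{5} \mathcal{E}_i \big) \ge \mathbb{P}\big( Y > \tau B \,\big|\, \cap_{i=1}^{5} \mathcal{E}_i \big). \tag{79}$$

We next give a lower bound on the above probability, as follows:

$$\begin{aligned}
\mathbb{P}(J(K) > \tau B \mid \mathcal{E}_1 \cap \mathcal{E}_2)
&\ge \mathbb{P}\big( J(K) > \tau B,\ \cap_{i=3}^{5} \mathcal{E}_i \,\big|\, \mathcal{E}_1 \cap \mathcal{E}_2 \big) \\
&= \mathbb{P}\big( J(K) > \tau B \,\big|\, \cap_{i=1}^{5} \mathcal{E}_i \big)\,\mathbb{P}\big( \cap_{i=3}^{5} \mathcal{E}_i \,\big|\, \mathcal{E}_1 \cap \mathcal{E}_2 \big) \\
&\stackrel{(a)}{\ge} \mathbb{P}\big( Y > \tau B \,\big|\, \cap_{i=1}^{5} \mathcal{E}_i \big)\,\mathbb{P}\big( \cap_{i=3}^{5} \mathcal{E}_i \,\big|\, \mathcal{E}_1 \cap \mathcal{E}_2 \big) \\
&= \mathbb{P}\big( Y > \tau B,\ \cap_{i=3}^{5} \mathcal{E}_i \,\big|\, \mathcal{E}_1 \cap \mathcal{E}_2 \big) \\
&= \mathbb{P}(\mathcal{E}_5)\,\mathbb{P}\big( Y > \tau B,\ \mathcal{E}_3 \cap \mathcal{E}_4 \,\big|\, \mathcal{E}_1 \cap \mathcal{E}_2 \big) \\
&\ge \mathbb{P}(\mathcal{E}_5)\,\big( \mathbb{P}(Y > \tau B \mid \mathcal{E}_1 \cap \mathcal{E}_2) + \mathbb{P}(\mathcal{E}_3 \mid \mathcal{E}_1 \cap \mathcal{E}_2) + \mathbb{P}(\mathcal{E}_4 \mid \mathcal{E}_1 \cap \mathcal{E}_2) - 2 \big) \\
&\stackrel{(b)}{=} \mathbb{P}(\mathcal{E}_5)\,\big( \mathbb{P}(Y > \tau B \mid \mathcal{E}_1 \cap \mathcal{E}_2) + \mathbb{P}(\mathcal{E}_3 \mid \mathcal{E}_1 \cap \mathcal{E}_2) + \mathbb{P}(\mathcal{E}_4) - 2 \big) \\
&\stackrel{(c)}{=} \mathbb{P}(\mathcal{E}_5)\,\big( \mathbb{P}(Y > \tau B \mid \mathcal{E}_1 \cap \mathcal{E}_2) + \mathbb{P}(\mathcal{E}_3) + \mathbb{P}(\mathcal{E}_4) - 2 \big),
\end{aligned} \tag{80}$$

where step (a) follows from Eq. (79), and (b) and (c) from the independence between $\mathcal{E}_4$ and
$\mathcal{E}_1 \cap \mathcal{E}_2$, and between $\mathcal{E}_3$ and $\mathcal{E}_1 \cap \mathcal{E}_2$, respectively (Lemma 5). We have also used the inequality that
$\mathbb{P}(A \cap B) \ge \mathbb{P}(A) + \mathbb{P}(B) - 1$, for any events $A$ and $B$.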

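By Claim 3 of Lemma 3, we have that

$$\lim_{\lambda \to 1} \mathbb{P}(\mathcal{E}_3) = \lim_{\lambda \to 1} \mathbb{P}(\mathcal{E}_4) = 1. \tag{81}$$

Combining the assumption (Eq. (41))

$$\liminf_{\lambda \to 1} \mathbb{P}(Y > \tau B \mid \mathcal{E}_1 \cap \mathcal{E}_2) = q > 0 \tag{82}$$

with Eqs. (80) and (81), we have that there exists $\bar{\lambda} \in (0,1)$, such that

$$\begin{aligned}
\mathbb{P}(J(K) > \tau B \mid \mathcal{E}_1 \cap \mathcal{E}_2)
&\ge \mathbb{P}(\mathcal{E}_5)\,\big( \mathbb{P}(Y > \tau B \mid \mathcal{E}_1 \cap \mathcal{E}_2) + \mathbb{P}(\mathcal{E}_3) + \mathbb{P}(\mathcal{E}_4) - 2 \big) \\
&\ge \mathbb{P}(\mathcal{E}_5)\,q/2,
\end{aligned} \tag{83}$$

for all $\lambda \in (\bar{\lambda}, 1)$. We have that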


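$$\begin{aligned}
\mathbb{E}(J(K)) &\ge \tau B \cdot \mathbb{P}(J(K) > \tau B) \\
&\ge \tau B \cdot \mathbb{P}(J(K) > \tau B,\ \mathcal{E}_1 \cap \mathcal{E}_2) \\
&= \tau B \cdot \mathbb{P}(J(K) > \tau B \mid \mathcal{E}_1 \cap \mathcal{E}_2) \cdot \mathbb{P}(\mathcal{E}_1 \cap \mathcal{E}_2) \\
&\stackrel{(a)}{\ge} \frac{q}{2}\,\tau B\,\mathbb{P}(\mathcal{E}_5)\,\mathbb{P}(\mathcal{E}_1 \cap \mathcal{E}_2) \\
&\stackrel{(b)}{\succeq} B\,\mathbb{P}(\mathcal{E}_5) \\
&\stackrel{(c)}{\succeq} B \exp(-\gamma W_\lambda),
\end{aligned} \tag{84}$$

for some $\gamma > 0$, as $\lambda \to 1$, where step (a) follows from Eq. (83), (b) from Claims 1 and 2 of Lemma
3 and the independence of the events $\mathcal{E}_1$ and $\mathcal{E}_2$ (Claim 2 of Lemma 5), i.e., that

$$\mathbb{P}(\mathcal{E}_1 \cap \mathcal{E}_2) = \mathbb{P}(\mathcal{E}_1)\,\mathbb{P}(\mathcal{E}_2) \ge \frac{5}{6}\theta, \tag{85}$$

and (c) from Lemma 4. This proves Lemma 6 by setting $a = k + \phi + 2$. Q.E.D.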

A.6. Proof of Lemma 7

Proof. Let $\mathcal{X}_0 = [-1, W_\lambda]^{\mathbb{N}}$. We will show that $\mathcal{X}_0$ is compact under the metric $\|x - y\|_g =
\sum_{i=1}^{\infty} 2^{-i} |x_i - y_i|$. If this is true, it is not difficult to show that, for any $a \in \mathbb{R}_+$, the set $\{x \in \mathcal{X} :
x_1 \le a\} = \{0, \ldots, \lfloor a \rfloor\} \times \mathcal{X}_0$ is also compact under $\|\cdot\|_g$, and our second claim follows. Note that
a compact metrizable space is Polish, and it is easy to show that $\mathbb{Z}_+$ is Polish under the $l_1$ norm.
Our first claim thus also follows from the compactness of $\mathcal{X}_0$, by observing that the product of two
Polish spaces remains Polish.

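We now show the compactness of $\mathcal{X}_0$. It suffices to show that any sequence in $\mathcal{X}_0$, $\{x^i\}_{i \in \mathbb{N}}$, admits
a sub-sequence that converges to a point in $\mathcal{X}_0$. We will construct such a limiting point
coordinate-by-coordinate, as follows. Because $x^i_1$ is an element of the compact interval $[-1, W_\lambda]$ for all $i \in \mathbb{N}$,
there exists $y_1 \in [-1, W_\lambda]$ and an increasing sequence, $\{i^1_j\}_{j \in \mathbb{N}} \subset \mathbb{N}$, such that $\lim_{j \to \infty} x^{i^1_j}_1 = y_1$.
We now apply the same reasoning for progressively larger values of $k$: there exist $y_k \in [-1, W_\lambda]$ and
$\{i^k_j\}_{j \in \mathbb{N}}$ for $k = 2, 3, \ldots$, such that, for every $k \ge 2$, $\{i^k_j\}_{j \in \mathbb{N}}$ is a sub-sequence of $\{i^{k-1}_j\}_{j \in \mathbb{N}}$ and

$$\lim_{j \to \infty} x^{i^k_j}_k = y_k. \tag{86}$$

Fix $k \ge 2$. Because $\{i^k_j\}_{j \in \mathbb{N}}$ is a sub-sequence of $\{i^m_j\}_{j \in \mathbb{N}}$ for all $m \le k - 1$, Eq. (86) further implies
that

$$\lim_{j \to \infty} x^{i^k_j}_m = y_m, \quad \forall m \le k, \tag{87}$$

or, equivalently, that

$$\lim_{j \to \infty} \sum_{m=1}^{k} \left| x^{i^k_j}_m - y_m \right| = 0. \tag{88}$$

Let $y$ be the element of $\mathcal{X}_0$ whose coordinates are defined according to the above procedure. We
argue that $y$ is the limiting point for some sub-sequence of $\{x^i\}_{i \in \mathbb{N}}$. For every $k \in \mathbb{N}$, there exists
$j(k) \in \mathbb{N}$, such that for all $j \ge j(k)$,

$$\begin{aligned}
\left\| y - x^{i^k_j} \right\|_g &= \sum_{m=1}^{\infty} 2^{-m} \left| y_m - x^{i^k_j}_m \right| \\
&\stackrel{(a)}{\le} \sum_{m=1}^{k} \left| y_m - x^{i^k_j}_m \right| + (1 + W_\lambda)\,2^{-(k-2)} \\
&\stackrel{(b)}{\le} \frac{1}{k} + (1 + W_\lambda)\,2^{-(k-2)},
\end{aligned} \tag{89}$$

where step (a) follows from the fact that $|y_m - x^{i^k_j}_m| \le 2(W_\lambda + 1)$ for all $m \in \mathbb{N}$, and step (b) from
Eq. (88). Define

$$v_k = i^k_{j^*(k)}, \quad \text{where } j^*(k) = \max\{j(m) : 1 \le m \le k\}, \quad \forall k \in \mathbb{N}. \tag{90}$$

By Eq. (89), we have that

$$\left\| y - x^{v_k} \right\|_g \le \frac{1}{k} + (1 + W_\lambda)\,2^{-(k-2)}, \quad \forall k \in \mathbb{N}. \tag{91}$$

Therefore, $\{x^{v_k}\}_{k \in \mathbb{N}}$ is a sub-sequence of $\{x^i\}_{i \in \mathbb{N}}$, and it converges to $y$ as $k \to \infty$ under the metric
$\|\cdot\|_g$. This proves that $\mathcal{X}_0$ is compact. Q.E.D.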

















